A Novel Generalized Value Iteration Scheme For Partially-Unknown Continuous-Time Linear Systems
Abstract
This paper presents a novel generalized value iteration technique for solving, online, the discounted linear quadratic (LQ) optimal control problem for continuous-time (CT) linear systems with an unknown system matrix A. The method uses a discounted value function, a standard setting in reinforcement learning (RL) that has not been fully explored in RL for CT dynamical systems. A stepwise-varying learning rate η_i is introduced for fast and safe convergence. It is mathematically proven that if η_i lies within specified ranges, the proposed algorithm preserves the Hurwitz property, which guarantees the stability of the closed-loop system, and converges monotonically to the discounted LQ optimal solution. Because the proposed method generalizes the existing value iteration method, these proofs also yield stability and monotone convergence conditions for that method as a special case.
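To make the idea of a value iteration with a stepwise-varying learning rate concrete, here is a minimal scalar sketch in Python. It is not the paper's algorithm: it assumes the system parameter `a` is known (the paper treats A as unknown and learns online), and it assumes the update takes the form p_{i+1} = p_i + η_i · (residual of the discounted Riccati equation), where the scalar discounted equation is (2a − γ)p + q − b²p²/r = 0. The function names, the numerical values, and the particular schedule `eta` are all hypothetical choices made for illustration.

```python
import math


def discounted_lq_value_iteration(a, b, q, r, gamma, eta, p0=0.0, steps=200):
    """Scalar sketch (assumed update rule, model-based for illustration):
    p_{i+1} = p_i + eta(i) * [(2a - gamma) p_i + q - (b p_i)^2 / r]."""
    p = p0
    history = [p]
    for i in range(steps):
        # Residual of the scalar discounted algebraic Riccati equation.
        residual = (2.0 * a - gamma) * p + q - (b * p) ** 2 / r
        p = p + eta(i) * residual
        history.append(p)
    return p, history


def closed_form_solution(a, b, q, r, gamma):
    """Positive root of (2a - gamma) p + q - b^2 p^2 / r = 0."""
    c = 2.0 * a - gamma
    return r * (c + math.sqrt(c * c + 4.0 * b * b * q / r)) / (2.0 * b * b)


# Hypothetical example: an open-loop unstable scalar plant (a > 0).
a, b, q, r, gamma = 1.0, 1.0, 1.0, 1.0, 0.5


def eta(i):
    # Hypothetical stepwise-varying learning rate, decaying from 0.2 to 0.1.
    return 0.1 + 0.1 / (i + 1)


p_final, history = discounted_lq_value_iteration(a, b, q, r, gamma, eta)
p_star = closed_form_solution(a, b, q, r, gamma)

# Feedback gain and closed-loop pole implied by the final iterate.
k_final = b * p_final / r
closed_loop_pole = a - b * k_final
```

For these values the closed-form root is p = 2, and the iterates increase from p_0 = 0 toward it. The learning rate is kept small enough that each step shrinks the remaining error without overshooting, which is the kind of range condition on η_i that the abstract refers to, here checked only for this scalar example.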
Similar resources
Analytical and Verified Numerical Results Concerning Interval Continuous-time Algebraic Riccati Equations
This paper studies the interval continuous-time algebraic Riccati equation A∗X + XA + Q − XGX = 0 from both theoretical and computational perspectives. On the theoretical side, we show that Shary’s results for interval linear systems can only be partially generalized to this interval Riccati matrix equation. We then derive an efficient technique for enclosing the united stable...
Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics
In this paper, the optimal adaptive leader-follower consensus of linear continuous-time multi-agent systems is considered. The error dynamics of each player depend on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...
Eigenvalue Assignment Of Discrete-Time Linear Systems With State And Input Time-Delays
Time-delays are important components of many dynamical systems that describe coupling or interconnection between dynamics, propagation or transport phenomena, and heredity and competition in population dynamics. The stabilization with time delay in observation or control represents difficult mathematical challenges in the control of distributed parameter systems. It is well-known that the stabi...
Newton iterations in implicit time-stepping scheme for differential linear complementarity systems
We propose a generalized Newton method for solving the system of nonlinear equations with linear complementarity constraints in the implicit or semi-implicit time-stepping scheme for differential linear complementarity systems (DLCS). We choose a specific solution from the solution set of the linear complementarity constraints to define a locally Lipschitz continuous right-hand-side function in...
A New Inexact Inverse Subspace Iteration for Generalized Eigenvalue Problems
In this paper, we present an inexact inverse subspace iteration method for computing a few eigenpairs of the generalized eigenvalue problem Ax = λBx [Q. Ye and P. Zhang, Inexact inverse subspace iteration for generalized eigenvalue problems, Linear Algebra and its Applications, 434 (2011) 1697-1715]. In particular, the linear convergence property of the inverse subspace iteration is preserved.
Publication date: 2011